Method of synthesizing cDNA

ABSTRACT

A method for synthesizing cDNA possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA, which comprises (i) a process for annealing a double-stranded DNA primer and an RNA mixture containing mRNA possessing a cap structure, (ii) a process for preparing a conjugate of an mRNA/cDNA heteroduplex and a double-stranded DNA primer by synthesizing the first-strand cDNA primed with the double-stranded DNA primer using reverse transcriptase, and (iii) a process for circularizing the conjugate of the mRNA/cDNA heteroduplex and the double-stranded DNA primer by joining the 3′ and 5′ ends of the DNA strand containing cDNA using ligase. This method enables efficient synthesis of full-length cDNA possessing a consecutive sequence starting with a transcription-start-site nucleotide from a small amount of RNA in a small number of processes.

This application is a U.S. national stage of International Application No. PCT/JP2004/004458 filed Mar. 29, 2004.

TECHNICAL FIELD

The invention of this application relates to a method of cDNA synthesis. More particularly, the invention of this application relates to a novel, simple and highly efficient method for synthesizing cDNA possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA.

BACKGROUND ART

Genome projects have determined almost all sequences of genome DNA (chromosome DNA) that covers all genetic information of various organisms including human, mouse, rice, nematode, yeast and so on. The entire sequence of these genomes is expected to give us information on the primary structure of proteins encoded by genes and information on the expression regulation regions (promoter, enhancer, suppressor etc.) that regulate the expression of the gene. In order to extract these two kinds of information from the genome sequence, the sequence information of mRNA transcribed from the gene locus of chromosome DNA is crucial. In order to analyze the sequence of mRNA, DNA complementary to mRNA (complementary DNA: cDNA) has usually been used. In particular, in order to obtain the foregoing two kinds of information, it is necessary to obtain cDNA (full-length cDNA) synthesized from mRNA that is correctly transcribed from the gene transcription region and contains an entire protein-coding region.

Usually, full-length cDNA must meet two requirements. One is to possess a sequence starting with a transcription start site on the genome DNA. A “cap structure” is added to the 5′ end of mRNA that is properly transcribed from the transcription start site. This cap structure is 7-methylguanosine (m⁷G) connected to the transcription-start-site nucleotide via a 5′-5′ triphosphate linkage. The cDNA complementary to the mRNA possessing this cap structure meets one requirement for full-length cDNA. The other indicator is the presence of a “poly(A) tail” of mRNA. This poly(A) tail is a consecutive sequence of several tens to 200 adenines (A) that is added to the 3′ end of mRNA in the nucleus after transcription of genome DNA. Therefore, cDNA correctly synthesized from an mRNA template possessing both the cap structure at the 5′ end and the poly(A) tail at the 3′ end meets the two requirements for full-length cDNA (starting with a transcription start site and encompassing an entire protein-coding region).

The cDNA can be synthesized by a reverse transcriptase reaction using mRNA as a template, but it is difficult to synthesize full-length cDNA, because mRNA transcribed from chromosome DNA is exposed to various degradation reactions in cells, during extraction from cells, or during synthesis of the DNA strand. The reverse transcription reaction using mRNA as a template synthesizes a DNA strand (the first-strand cDNA) toward the 5′ direction of mRNA from a primer oligonucleotide that is annealed with the 3′ end of mRNA. Thus, when the primer (oligo dT) is annealed with a poly(A) tail, it is easy to obtain cDNA covering the poly(A) tail. However, this method does not guarantee the synthesis of full-length cDNA possessing a sequence extending from the primer to the cap structure, because degradation of mRNA and/or interruption of the synthesis reaction of the DNA strand frequently occur. In fact, most of the vast number of ESTs (expressed sequence tags) reported so far were derived from incomplete cDNAs generated from degraded mRNA or incomplete cDNAs generated by interruption of the synthesis reaction of the DNA strand.

Therefore, many methods have been proposed to synthesize full-length cDNA possessing a sequence extending to the cap structure that exists at the 5′ end of mRNA. These methods are classified into the following four main categories according to the principle used.

(1) Tailing Method

This method is based on the addition of a homo-oligomer tail using terminal transferase to the first-strand cDNA extended to the cap structure. The Okayama-Berg method (Non-patent Document 1) and the Pruitt method (Non-patent Document 2) are included in this category. Since it is difficult to strictly control the number of added nucleotides, this method has the problem that an excessively long tail makes nucleotide sequence analysis difficult.

The template-switching method (Patent Document 1), which uses a dC tail added to the 3′ end of the first-strand cDNA by the terminal transferase activity of reverse transcriptase, is also included in this tailing method. The number of added dC was described to be 3 to 5 in the reference (Non-patent Document 3).

(2) Linker-ligation Method

This method comprises synthesis of the first-strand cDNA, removal of mRNA by alkaline or RNase H treatment, and ligation of a single-stranded oligonucleotide linker with a known sequence to the 3′ end of the single-stranded cDNA using T4 RNA ligase (Non-patent Document 4). This method is inappropriate for preparing a high-quality cDNA library because of formation of secondary structure in the single-stranded cDNA.

(3) Oligo-capping Method

This method is based on the replacement of the cap structure with an oligomer. Methods using an RNA oligomer (Non-patent Document 5) or a DNA-RNA chimeric oligomer (for example, Patent Document 2 by inventors of this application, Non-patent Document 6) have been reported. This method should produce only full-length cDNAs in principle, but it also produces some truncated cDNAs synthesized from degraded mRNAs that are produced during the many processes for treating mRNA, and in addition a large amount of poly(A)⁺ RNA, about 5-10 μg, is necessary. The use of total RNA as a starting material to suppress the degradation of mRNA has been reported to improve the full-length rate to more than 90%, but the number of reaction steps remains unchanged (Patent Document 3).

This method includes the method (Patent Document 4) in which a synthetic oligomer was added to the cap structure after opening its carbohydrate ring by periodate oxidation reaction.

(4) Cap-trapping Method

This method is based on selecting mRNAs possessing a cap structure and using them as a template. It includes the method using mRNA selected by anti-cap antibody as a template (Non-patent Document 7) and the method using biotinylated mRNA that is prepared by adding biotin to an open ring generated by periodate oxidation of the carbohydrate of the cap structure and selecting by avidin-immobilized carrier (Non-patent Document 8).

-   Patent Document 1: U.S. Pat. No. 5,962,272
-   Patent Document 2: 3337748
-   Patent Document 3: WO 01/04286
-   Patent Document 4: U.S. Pat. No. 6,022,715
-   Non-patent Document 1: Okayama, H. and Berg, P. Mol. Cell. Biol. 2:161-170, 1982.
-   Non-patent Document 2: Pruitt, S. C. Gene 66:121-134, 1988.
-   Non-patent Document 3: CLONTECHniques, July 1997, p. 26.
-   Non-patent Document 4: Edwards, J., Delort, J., and Mallet, J. Nucleic Acids Res. 19:5227-5232, 1991.
-   Non-patent Document 5: Maruyama, K. and Sugano, S. Gene 138:171-174, 1994.
-   Non-patent Document 6: Kato et al., Gene 150:243-250, 1994.
-   Non-patent Document 7: Edery, I., Chu, L. L., Sonenberg, N., and Pelletier, J. Mol. Cell. Biol. 15:3363-3371, 1995.
-   Non-patent Document 8: Carninci et al., Genomics 37:327-336, 1996.

DISCLOSURE OF INVENTION

Any of the foregoing conventional methods enables us to synthesize full-length cDNA. However, even if the synthesized cDNAs contain full-length cDNA at high rates, they inevitably include incomplete cDNAs derived from degraded mRNA and/or incomplete cDNAs produced by interruption of cDNA synthesis. Therefore, it is necessary to determine whether or not the synthesized cDNA is derived from full-length mRNA possessing a cap structure. In general, if multiple clones possessing the same 5′-terminal sequence exist, it is highly probable, but not conclusive, that these clones are derived from full-length mRNA. Especially, in the case of genes possessing multiple transcription start sites, it is very difficult to determine whether the cDNA clone is derived from full-length mRNA or from degraded mRNA lacking a 5′ end.

Therefore, a method for synthesizing cDNA, by which we can synthesize full-length cDNA at high rates and determine whether or not it possesses a sequence starting with a transcription start site, has been desired.

Also, all of the foregoing conventional methods have the problem that they require many processes. For example, the oligo-capping method described in Patent Document 2 is superior with respect to its ability to reliably synthesize cDNA possessing a nucleotide sequence starting with a cap site, but it requires 8 processes to synthesize cDNA. An increase in the number of processes causes problems such as a decrease in synthetic yield and increases in time, labor, and cost.

Furthermore, some conventional methods contain an amplification process by PCR (Non-patent Documents 4 and 5). Thus, these methods have the problem that the generated cDNA sequence contains artificial mutations because the DNA polymerase used for PCR frequently incorporates a nucleotide different from the template nucleotide during the polymerase reaction.

Therefore, a method for synthesizing full-length cDNA from a small amount of RNA in a small number of processes without using PCR has been desired.

The invention of this application has been made under the foregoing circumstances, and makes it an object to provide a cDNA synthesis method satisfying the following requirements:

-   (1) a starting material is total RNA of one to several micrograms;
-   (2) no use of PCR;
-   (3) to consist of as few processes as possible;
-   (4) to synthesize full-length cDNA that is guaranteed to possess a consecutive sequence starting with a transcription-start-site nucleotide in a high yield of more than 90%.

No conventional method satisfies all of these requirements.

The first invention to solve the foregoing problem is a method for synthesizing cDNA possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA, which method comprises the processes of:

-   (i) annealing a double-stranded DNA primer and an RNA mixture containing mRNA possessing a cap structure,
-   (ii) preparing a conjugate of an mRNA/cDNA heteroduplex and the double-stranded DNA primer by synthesizing the first-strand cDNA primed with the double-stranded DNA primer using reverse transcriptase, and
-   (iii) circularizing the conjugate of the mRNA/cDNA heteroduplex and the double-stranded DNA primer by joining the 3′ and 5′ ends of the DNA strand containing cDNA using ligase.

In the method of this first invention, a preferred aspect is that mRNA possessing a cap structure is contained in a cell extract, or that mRNA possessing a cap structure is synthesized by in vitro transcription.

Also, in the method of this first invention, a preferred aspect is that the primer sequence of the double-stranded DNA primer contains a sequence complementary to a partial sequence of mRNA possessing a cap structure or an oligo dT complementary to a poly(A) sequence of mRNA possessing a cap structure.

Furthermore, in the method of this first invention, a preferred aspect is that the ligase is T4 RNA ligase.

In the method of this first invention, another preferred aspect is that it comprises the following process between the processes (ii) and (iii):

-   (ii′) generating a 5′-protruding end or a blunt end at the terminal    of the double-stranded DNA primer by cutting the conjugate of the    mRNA/cDNA heteroduplex and the double-stranded DNA primer using a    restriction enzyme.

The second invention is a method for synthesizing cDNA, which comprises the following process in addition to the method of the foregoing first invention:

-   (iv) synthesizing a second-strand cDNA by replacing an RNA strand    with a DNA strand in the conjugate of the mRNA/cDNA heteroduplex and    the double-stranded DNA primer.

In the method of this second invention, a preferred aspect is that the double-stranded DNA primer contains a replication origin or both a replication origin and a promoter for cDNA expression.

The method of this second invention provides a clone containing the synthesized double-stranded cDNA.

In the method of this second invention, another preferred aspect is to include the following process:

-   (v) incorporating the double-stranded cDNA composed of the first-strand cDNA and the second-strand cDNA into a vector DNA.

This process enables us to clone the synthesized double-stranded cDNA into a vector.

The third invention is a cDNA library that is a population of clones containing double-stranded cDNA synthesized by the method of the foregoing second invention, of which more than 60% of the cDNA clones possess a 5′-end nucleotide of (dT)ndG (n=0-5) followed by a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA.

The fourth invention is a method for selecting a cDNA clone possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA, from clones in the cDNA library of the foregoing third invention, wherein a cDNA clone possessing a 5′-end nucleotide of (dT)ndG (n=0-5) is selected as an objective clone.

The fifth invention is a double-stranded DNA primer possessing an oligo (dT)n (n=15-100) as a primer part, in which one terminal part of a primer side has an 8-base recognition restriction enzyme site RE1, and another terminal part has an 8-base recognition restriction enzyme site RE2 and a restriction enzyme site RE3 generating a 5′-protruding end or a blunt end.

A preferred aspect of the fifth invention is that the double-stranded DNA primer contains a replication origin or both a replication origin and a promoter for cDNA expression. An example of the double-stranded DNA primer of the fifth invention is a vector primer derived from pGCAP10 comprising the nucleotide sequence of SEQ ID NO: 2.

The sixth invention is a reagent kit for cDNA synthesis, which comprises a double-stranded DNA primer, reverse transcriptase and its reaction buffer solution, T4 RNA ligase and its reaction buffer solution, and model mRNA possessing a cap structure.

The foregoing invention is a method for synthesizing cDNA possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA in high yield, which method comprises at least the following three processes:

-   (i) annealing of a double-stranded DNA primer and mRNA,
-   (ii) preparation of a conjugate of an mRNA/cDNA heteroduplex and the double-stranded DNA primer by synthesizing the first-strand cDNA,
-   (iii) joining the 3′ and 5′ ends of a DNA strand containing cDNA in the conjugate of the mRNA/cDNA heteroduplex and the double-stranded DNA primer.

This invention has been completed by finding that, when mRNA possessing a cap structure was used as a template and the base of the cap was “G”, “dC” [or 5′-dC(dA)n-3′ (n=1-5)] was added to the 3′ end of the first-strand cDNA by the foregoing process (ii). Since “dT” [or 5′-dT(dA)n-3′ (n=1-5)] was added to the 3′ end of the first-strand cDNA when the base of the cap was “A”, the added nucleotide was shown to have the base complementary to that of the cap structure. Also, no addition of an extra nucleotide to the 3′ end of the first-strand cDNA was observed when RNA not possessing a cap structure was used as a template. Thus, when an extra “dG” [or 5′-(dT)ndG-3′ (n=1-5)] exists at the 5′ end of the cDNA, we can decide that this cDNA is full-length cDNA derived from mRNA possessing a cap structure. Since more than one extra “dG” can be added depending on the reaction conditions of reverse transcriptase (non-patent reference), if (dN)ndG (dN is dT or dG, n=0-5) exists at the 5′ end of the cDNA, generally we can decide that this cDNA is full-length cDNA derived from mRNA possessing a cap structure. However, when more than one extra “dG” is added, it is difficult to decide which “dG” is the extra one. Thus, it is preferable to perform the reaction under conditions that add one extra “dG”, as shown in Examples.

In addition, the addition of 3-5 dCs was described in the template-switching method (Non-patent Document 3), but we could not observe such addition of multiple dCs under the conditions in Examples of this invention. Also, there is a report that one dC was preferentially added to the 3′ end of the first-strand cDNA by the terminal transferase-like activity of reverse transcriptase (Schmidt, W. M. and Muller, M. W., Nucleic Acids Res. 27:e31, 1999), but there is no report about its mechanism. Furthermore, there is a report that, when reverse transcriptase acted on an RNA/DNA heteroduplex (corresponding to the cap structure-free RNA/DNA heteroduplex in this invention), one nucleotide (“dA” or “dG” or “dC” or “dT”) was added to the 3′ end of 90% of the DNA (Chen, D. and Patton, J. T., BioTechniques 30:574-582, 2001), but such addition was seldom observed under the conditions in Examples of this invention.

In this invention, the term “nucleotide” means a phosphoester (ATP, GTP, CTP, UTP; or dATP, dGTP, dCTP, dTTP) of a nucleoside that contains a sugar linked to purine or pyrimidine via a beta-N-glycosidic bond. Hereafter these nucleotides can be described simply by “A”, “G”, “C”, “U”, or “dA”, “dG”, “dC”, “dT”. The term “complementary” means a pairing of the nucleotides via hydrogen bonds; “A” (or “dA”) and “U” (or “dT”), or “G” (or “dG”) and “C” (or “dC”).

The term “double-stranded DNA primer” means double-stranded DNA in which one end of a DNA strand is a 3′-protruding end whose sequence is complementary to that of template mRNA. This protruding part hybridizes to the template mRNA, and works as a primer for the first-strand cDNA synthesis by reverse transcriptase. In particular, double-stranded DNA with a replication origin is called a “vector primer”.

Furthermore, in this invention, the term “mRNA possessing a cap structure” means mRNA that is transcribed from genome DNA and whose 5′ end is linked to guanosine possessing methylated guanine (G) via a 5′-5′ triphosphate bond (mGp5′-5′pp). For example, in the case of a cap structure in which the seventh position of G is methylated, the mRNA has the following structure:

5′-m⁷GN₁N₂N₃N₄N₅ - - - Nₘ-3′  (a)

(where N is A, G, C, or U, and m is a positive integer of more than 50)

In this invention, the term “cDNA possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA” means all cDNAs including the following cDNAs.

cDNA (the first-strand cDNA) complementary to a sequence N₁ - - - Nₘ in the foregoing structure (a) of mRNA, wherein 5′-dC(dA)ₙ-3′ (n=0-5) (more generally, 5′-dC(dN)ₙ-3′ (where dN is dA or dC, n=0-5)) is added to the 3′ end:

3′-(dA)ₙdCdN₁dN₂dN₃dN₄dN₅ - - - dNₘ-5′  (b)

(where dN is dA, dG, dC, or dT),

cDNA (the second-strand cDNA) complementary to this cDNA (b):

5′-(dT)ₙdGdN₁dN₂dN₃dN₄dN₅ - - - dNₘ-3′  (c),

and a cDNA (b)/(c) duplex (double-stranded cDNA). The term “cDNA” simply means double-stranded cDNA, but it means the second-strand cDNA described by the foregoing structure (c) in the case of referring to its sequence.

Since the N₁ in the structure (a) of mRNA is a nucleotide corresponding to a transcription start site, “cDNA possessing a consecutive sequence starting with a nucleotide adjacent to a cap structure of mRNA” can also be defined as “cDNA possessing a consecutive sequence starting with a transcription-start-site nucleotide”.
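The relationship between structures (a), (b), and (c) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, all strands are written 5′ to 3′, and only the case of a cap whose base is “G” (addition of 5′-dC(dA)ₙ-3′) is modeled.

```python
# Illustrative sketch; names and the example sequence are hypothetical.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "G": "C", "C": "G", "U": "A"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna_body, n_extra_da=0):
    """Return structure (b), written 5'->3'.

    mrna_body is N1..Nm of structure (a), written 5'->3' without the cap.
    The cap-dependent addition 5'-dC(dA)n-3' is appended to the 3' end.
    """
    if not 0 <= n_extra_da <= 5:
        raise ValueError("n must be in the range 0-5")
    copied = "".join(RNA_TO_DNA_COMPLEMENT[b] for b in reversed(mrna_body))
    return copied + "C" + "A" * n_extra_da


def second_strand_cdna(first_strand):
    """Return structure (c): the reverse complement of structure (b)."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(first_strand))


body = "ACUUG"  # hypothetical N1..N5 of a capped mRNA
b = first_strand_cdna(body, n_extra_da=2)
c = second_strand_cdna(b)
print(b)  # first strand, 5'->3'
print(c)  # second strand, 5'->3', beginning with (dT)n dG
```

In this sketch the second strand begins with (dT)ₙdG followed by the DNA version of N₁ - - - Nₘ, which is the arrangement shown in structure (c).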

Hereafter “cDNA possessing a consecutive sequence starting with a nucleotide (a transcription-start-site nucleotide) adjacent to a cap structure of mRNA” can be described as “cap-consecutive cDNA”. Also, in particular, cap-consecutive cDNA possessing a poly(A) sequence can be described as “full-length cDNA”. Furthermore, cDNA that does not contain a nucleotide (at least dN₁ in the structure (b) or (c)) adjacent to a cap structure can be described as “cap-nonconsecutive cDNA”. Still furthermore, “mRNA possessing a cap structure” can be described as “cap(+)mRNA”, “mRNA not possessing a cap structure” can be described as “cap(−)mRNA”, and mRNA that is produced by removing a cap structure from cap(+)mRNA can be described as “decapped mRNA”.

Other terms and concepts according to this invention will be defined in detail by referring to the embodiments of the invention and Examples. In addition, various techniques to be used for carrying out this invention can be easily and reliably carried out by those skilled in the art on the basis of known literature and the like except for the techniques whose references are particularly specified. For example, the techniques of genetic engineering and molecular biology of this invention are described in Sambrook and Maniatis, in Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1989; Ausubel, F. M. et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y. 1995, and the like.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram schematically showing basic processes of this invention.

FIG. 2 is a view exemplifying the general structure of a vector primer of this invention.

FIG. 3 is a view schematically showing the structures of pGCAP1 and a pGCAP1-derived vector primer of this invention.

FIG. 4 is a view schematically showing the structures of pGCAP10 and a pGCAP10-derived vector primer of this invention.

BEST MODE FOR CARRYING OUT THE INVENTION

The first invention is a method for synthesizing cap-consecutive cDNA (the first-strand cDNA), which method indispensably comprises the following processes (i), (ii), and (iii) (refer to FIG. 1).

-   Process (i): to anneal a double-stranded DNA primer and an RNA mixture containing mRNA possessing a cap structure.
-   Process (ii): to prepare a conjugate of an mRNA/cDNA heteroduplex and the double-stranded DNA primer by synthesizing the first-strand cDNA primed with the double-stranded DNA primer using reverse transcriptase.
-   Process (iii): to circularize the conjugate of the mRNA/cDNA heteroduplex and the double-stranded DNA primer by joining the 3′ and 5′ ends of the DNA strand containing cDNA using ligase.

In the process (i), an “RNA mixture containing mRNA possessing a cap structure” may contain only mRNA possessing a cap structure, or may also contain others such as “cap(−)mRNA” and/or “other RNA molecules (for example, rRNA, tRNA etc.)”. Such an RNA mixture may be derived from either unicellular eukaryotes or multicellular eukaryotes. Also, this RNA mixture may be either RNA synthesized by in vitro transcription using DNA as a template or total RNA extracted from cells. In this process (i), although less than one μg of total RNA can be used as the RNA mixture to synthesize cDNA, more than one μg of total RNA is preferably used. Although the mRNA content in total RNA extracted from cells is 2-3%, the method of this invention enables us to synthesize cap-consecutive cDNA or full-length cDNA from total RNA containing such a low amount of mRNA.
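As a rough illustration of the quantities involved, the following Python sketch estimates the mass of mRNA contained in a given mass of total RNA from the 2-3% content stated above. The function name and the choice of nanograms as the output unit are assumptions made for this illustration only and are not part of the described method.

```python
def mrna_mass_range_ng(total_rna_ug, low_pct=2.0, high_pct=3.0):
    """Estimate the mRNA mass (ng) in total RNA, assuming 2-3% mRNA content."""
    if total_rna_ug < 0:
        raise ValueError("total RNA mass must be non-negative")
    total_ng = total_rna_ug * 1000.0  # 1 microgram = 1000 nanograms
    return total_ng * low_pct / 100.0, total_ng * high_pct / 100.0


low, high = mrna_mass_range_ng(1.0)
print(f"1 ug of total RNA holds roughly {low:.0f}-{high:.0f} ng of mRNA")
```

Under this assumption, one microgram of total RNA corresponds to only a few tens of nanograms of mRNA template.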

A “double-stranded DNA primer” used in the process (i) has a “primer sequence” at its 3′-protruding end. When information on the partial sequence of target mRNA possessing a cap is known, the primer sequence can be prepared based on this known sequence using known chemical synthetic methods (for example, methods described in Carruthers, Cold Spring Harbor Symp. Quant. Biol. 47:411-418, 1982; Adams, J. Am. Chem. Soc. 105:661, 1983; Belousov, Nucleic Acid Res. 25:3440-3444, 1997; Frenkel, Free Radic. Biol. Med. 19:373-380, 1995; Blommers, Biochemistry 33:7886-7896, 1994; Narang, Meth. Enzymol. 68:90, 1979; Brown, Meth. Enzymol. 68:109, 1979; Beaucage, Tetra. Lett. 22:1859, 1981; U.S. Pat. No. 4,458,066). Most of the known EST (expressed sequence tag) sequences were derived from a 3′-end partial sequence of cDNA. Using a primer sequence prepared based on these EST sequences, we can obtain cap-consecutive cDNA containing the 5′-upstream region of the corresponding EST sequence. Also, a primer containing an oligo dT that is complementary to poly(A) of mRNA can be used. The number of consecutive dTs composing the oligo dT is preferably 30-70. Using these oligo dT primers, cap-consecutive cDNA extending to a poly(A) site (full-length cDNA) can be obtained. On the other hand, there is no restriction on the other terminal sequence of the double-stranded DNA primer, so that any double-stranded DNA can be used, but preferably it should possess a linking terminal for joining to the cloning site of a vector DNA to make it easy to insert into the vector DNA in a later process.
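The two kinds of primer sequence described above can be sketched in Python as follows. This is a minimal illustration, not a primer-design tool: the function names and the example sequence are hypothetical, the input region is assumed to be given 5′ to 3′ in the sense (mRNA) orientation, and considerations such as melting temperature or specificity are not modeled.

```python
# Illustrative sketch; names and the example sequence are hypothetical.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def primer_for_region(known_region):
    """Return a DNA primer (5'->3') complementary to a known mRNA or EST
    region that is given 5'->3' in the sense orientation."""
    seq = known_region.upper()
    unknown = set(seq) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def oligo_dt(n=60):
    """Return an oligo dT primer sequence within the preferred 30-70 range."""
    if not 30 <= n <= 70:
        raise ValueError("n is outside the preferred range of 30-70")
    return "T" * n


print(primer_for_region("AUGGCC"))  # gene-specific primer sequence
print(len(oligo_dt(60)))            # length of the oligo dT primer sequence
```

A gene-specific primer of this kind yields cap-consecutive cDNA upstream of the known region, whereas the oligo dT primer yields cDNA extending to the poly(A) site.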

In the process (ii), by allowing reverse transcriptase to act on the double-stranded DNA primer annealed with mRNA, a cDNA strand complementary to mRNA is synthesized in the 5′ direction of mRNA starting from the 3′ end of the double-stranded DNA primer. This process produces a conjugate of an mRNA/cDNA heteroduplex and a double-stranded DNA primer. In this conjugate, one end of the cDNA strand in the mRNA/cDNA heteroduplex joins the end of one strand in the double-stranded DNA primer. Furthermore, cDNA in the mRNA/cDNA heteroduplex produced by this process (ii) is cap-consecutive cDNA that is complementary to cap(+)mRNA and possesses dC or 5′-dC(dA)ₙ-3′ at its 3′ end, or cap-nonconsecutive cDNA that is derived from cap(−)mRNA.

As reverse transcriptase, an enzyme that is derived from M-MLV (Moloney murine leukemia virus) or AMV (avian myeloblastosis virus) can be used, but one from which endogenous RNaseH activity has been removed is preferable.

As the ligase in the process (iii), various kinds of DNA ligase or RNA ligase of choice can be used, but T4 RNA ligase is preferable. There is a report on a method to join two oligodeoxynucleotides hybridized to RNA using T4 RNA ligase (U.S. Pat. No. 6,368,801), but there is no report on ligation between an mRNA/cDNA heteroduplex and a double-stranded DNA. The ligation is preferably performed after carrying out a process (ii′) by which the end of the double-stranded DNA primer is converted to a 5′-protruding end or a blunt end by cutting the conjugate of the mRNA/cDNA heteroduplex and the double-stranded DNA with a restriction enzyme. The disadvantage of adding this one process is compensated by two merits: a decrease in the background composed of vector without a cDNA insert, and an increase in ligation efficiency. The cap structure of the mRNA/cDNA heteroduplex may or may not be removed, for example, using tobacco acid pyrophosphatase. Furthermore, the ligation may be performed after degradation of mRNA in the mRNA/cDNA heteroduplex or replacement of the mRNA strand by a DNA strand. In this case, it should be noted that the degradation product of mRNA generated during this process can be added to the 3′ end of the cDNA.

The foregoing method enables us to synthesize a circular DNA strand that contains cap-consecutive cDNA (the first-strand cDNA) described by the following:

3′-(dA)ₙdCdN₁dN₂dN₃dN₄dN₅ - - - dNₘ-5′  (b)

By analyzing the sequence of the obtained cap-consecutive cDNA and comparing it with a genome sequence, information can be obtained to specify the transcription start site of a gene and the expression regulation region upstream of it.

The cap-consecutive cDNA (the first-strand cDNA) obtained using the foregoing method is converted to double-stranded cDNA by the process (iv) of the second invention. This process (iv) can be performed by replacing an RNA strand with a DNA strand, for example, by the action of RNaseH, E. coli DNA polymerase I, E. coli DNA ligase, and the like. This process need not be done in vitro; for example, if a “vector primer” is used as a double-stranded DNA primer, the RNA strand can be replaced with the DNA strand in cells such as E. coli cells after the ligation product is introduced into the cells.

The method of the second invention enables us to clone the double-stranded cDNA into a vector by the process (v). For example, after cutting at restriction enzyme sites that are set up in the double-stranded DNA, the double-stranded cDNA can be inserted into a plasmid vector or phage vector, and then used for sequencing analysis or production of its expression product.

In the method of this second invention, the use of a “vector primer” as a double-stranded DNA primer is preferable because the process for inserting the cDNA into another vector can be omitted. The vector primer can be prepared by cutting a circular DNA vector at an appropriate site with a restriction enzyme and then joining a 3′-protruding primer sequence that is complementary to a part of the mRNA. Also, in order to synthesize full-length cDNA, oligo dT (preferably 30-70 nucleotides) may be joined at the 3′ end. Especially, the double-stranded DNA primer possessing the oligo dT as a 3′-protruding end is preferable, for example, to efficiently prepare a full-length cDNA library. Also, in order to make it easy to recombine the excised cDNA into another vector, it is preferable to set up an 8-base recognition restriction enzyme site in the double-stranded DNA part. Furthermore, it is preferable to set up a replication origin in the double-stranded DNA part. As the replication origin, those that function in a prokaryotic cell such as E. coli or in eukaryotic cells such as yeast, insect cells, mammalian cells, plant cells and the like can be used. This enables us to replicate the obtained cDNA vector after introduction into these cells. Furthermore, it is preferable to set up a promoter, a splicing region, a poly(A) addition signal and the like in the double-stranded DNA part in order to express the cDNA in vitro by in vitro transcription/translation or in vivo by introducing it into eukaryotic cells.

These double-stranded DNA primers (the fifth invention) may be properly designed using an appropriate vector DNA as a starting material, or a known vector primer such as a pKA1 vector primer (one end possesses a 3′-protruding dT tail of about 60 nucleotides, and the other end is an EcoRV blunt end [Kato et al., Gene 150:243-250, 1994]) or the like can be used. The general structure of the double-stranded vector primer of this invention is shown in FIG. 2. This vector primer has 60±10 dTs as a primer sequence. The other terminal may be either a blunt end or a protruding end. It is preferable that the end of the primer side contains an 8-base recognition restriction enzyme site RE1 and that the other end has an 8-base recognition restriction enzyme site RE2 and a restriction enzyme site RE3 generating a 5′-protruding end or a blunt end. As an 8-base recognition restriction enzyme site, NotI, Sse8387I, PacI, SwaI, SfiI, SgrAI, AscI, FseI, PmeI, SrfI or the like can be used. A vector primer pGCAP1 prepared in this invention has an AflII site (CTTAAG) as RE3. If the 3′ end of the first-strand cDNA is joined to the 5′-protruding end of this AflII site ( . . . CTTAA), in the case of “cap-consecutive cDNA”, “dG” is added to the 5′-protruding end, resulting in generation of . . . CTTAAG . . . , that is, restoration of the AflII site. Thus, the “cap-consecutive cDNA” clones can be cut with AflII. Using this property, cutting with AflII can be used to determine whether or not cap-consecutive cDNA has been synthesized. In addition, MunI (CAATTG), XhoI (CTCGAG) or the like can be used as a restriction site whose recognition site is restored by the addition of “dG”. The pGCAP10 vector primer prepared in this invention possesses NotI as RE1, SwaI as RE2, and EcoRI as RE3.
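The restoration of the AflII site at the vector/cDNA junction can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not stated in the source: the junction is read on the sense (second-strand) side, the vector end is represented only by its final bases, and the function name and example sequences are hypothetical.

```python
# Illustrative sketch; names and example sequences are hypothetical.
def site_restored(vector_end, insert_start, site="CTTAAG"):
    """Return True if joining vector_end to insert_start recreates `site`
    exactly across the junction (all but the last base of the site come
    from the vector, the last base from the insert)."""
    vector_end = vector_end.upper()
    insert_start = insert_start.upper()
    return vector_end.endswith(site[:-1]) and insert_start.startswith(site[-1])


# Cap-consecutive clone: the insert starts with the extra dG.
print(site_restored("GGCCTTAA", "GACTTG"))
# Clone without the extra dG at the junction.
print(site_restored("GGCCTTAA", "ACTTG"))
# The same check for an XhoI-type end (CTCGAG).
print(site_restored("AACTCGA", "GACTTG", site="CTCGAG"))
```

In this simplified model, a clone carrying extra dT residues before the dG would not restore the site, so digestion would be expected to report only the clones with a single added dG; this is an inference from the model rather than a statement of the source.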

The third invention of this application is a “cDNA library” composed of a population of cDNA vectors that is the final product prepared by the method of the foregoing second invention. As shown in the Examples below, the cDNA library is characterized by containing cap-consecutive cDNA clones at extremely high rates: more than 60%, preferably more than 75%, more preferably more than 90%, and most preferably more than 95%.

Accordingly, the cDNA library of this third invention makes it possible to isolate and analyze cap-consecutive cDNA at high rates without selecting for it, because most of the clones in the library are cap-consecutive cDNAs. For greater accuracy, the cap-consecutive cDNA can be correctly selected by the method of the fourth invention of this application. Since the cap-consecutive cDNA synthesized by the method of the foregoing first and second inventions is characterized by the presence of “(dT)ndG” at the 5′ end, the cap-consecutive cDNA can be identified by examining the presence of this “(dT)ndG” as an indicator without determining the entire nucleotide sequence of the cDNA. The presence of “(dT)ndG” can be examined using known methods for nucleotide sequencing. Since more than 90% of cap-consecutive cDNAs start with “dG”, the presence of “dG” can be used in practice as an indicator.
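As an illustration of using “(dT)ndG” as an indicator, the following minimal sketch compares the 5′-end read of a clone with a reference 5′-end sequence (for example, from the genome) and tests whether the extra nucleotides in front of the reference have the form (dT)ndG. The function names, the anchor length k and the exact-match search are assumptions made for this sketch, not a procedure taken from this application.

```python
import re

# (dT)n dG with n >= 0, i.e. "G", "TG", "TTG", ...
CAP_TAG = re.compile(r"T*G")


def added_prefix(clone_5p: str, reference_5p: str, k: int = 12):
    """Return the nucleotides preceding the reference start in the clone.

    Assumption: the first k bases of the reference occur exactly in the
    clone read; None is returned when they do not.
    """
    idx = clone_5p.find(reference_5p[:k])
    return None if idx < 0 else clone_5p[:idx]


def has_cap_indicator(clone_5p: str, reference_5p: str) -> bool:
    """True if the clone carries an extra (dT)n dG in front of the reference."""
    prefix = added_prefix(clone_5p, reference_5p)
    return prefix is not None and CAP_TAG.fullmatch(prefix) is not None


# Elongation factor 1-alpha 5' end as reported in Example 2.
reference = "CTTTTTCGCAACGGGTTTG"
print(has_cap_indicator("G" + reference, reference))
print(has_cap_indicator(reference, reference))
```

A clone whose read begins exactly at the reference start has an empty prefix and is not flagged, which corresponds to the clones with no extra “dG” reported in Example 1.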

The sixth invention of this application is a set of the minimum reagents necessary for synthesizing cDNA using the method of this invention, that is, a cDNA synthesis reagent kit comprising a double-stranded DNA primer, reverse transcriptase and its reaction buffer solution, T4 RNA ligase and its reaction buffer solution, and model RNA possessing a cap. By using this kit, a cDNA library containing cap-consecutive cDNAs can be easily prepared from a given RNA mixture containing mRNA possessing a cap.

EXAMPLES

Hereunder, the invention of this application will be explained more specifically and in more detail by showing Examples; however, the invention of this application is not intended to be limited to these Examples. Basic procedures and enzymatic reactions related to DNA recombination followed the literature (Sambrook and Maniatis, in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1989). Restriction enzymes and various modification enzymes were those manufactured by Takara Shuzo Co. Ltd. unless otherwise stated. The composition of the buffer solution for each enzymatic reaction and the reaction conditions followed the attached instructions.

Example 1 cDNA Synthesis Using Cap Analogue-attached RNA

(1) Preparation of Cap Analogue-attached RNA

A full-length cDNA clone of human elongation factor-1α (EF-1α), pHP00155 (Non-patent Document 6), was linearized by digesting with NotI, and then this was used as a template to prepare mRNA using an in vitro transcription kit (Ambion). By adding m⁷G(5′)pppG(5′) or A(5′)pppG(5′) to the reaction solution as a cap analogue, model mRNA possessing “m⁷G” or “A” as a cap structure was obtained. In addition, by not adding the cap analogue, model mRNA without a cap structure was obtained. The 5′-terminal sequence of the in vitro transcription product is the sequence derived from the vector (5′-GGGAATTCGAGGA-3′) followed by the 5′-terminal sequence of EF-1α (5′-CTTTTTCGCAA . . . ).

(2) Synthesis of the First-strand cDNA

The first-strand cDNA complementary to the model mRNA was synthesized by mixing 0.3 μg of the foregoing model mRNA and 0.3 μg of a pKA1 vector primer (one end has a 3′-protruding dT tail of about 60 nucleotides and the other end is an EcoRV blunt end) (Non-patent Document 6) in a reaction solution (50 mM Tris-HCl, pH 8.3, 75 mM KCl, 3 mM MgCl₂, 5 mM DTT, 1.25 mM dNTP), annealing the model mRNA and the vector primer, adding 200 U of reverse transcriptase SuperScript™ II (Invitrogen) and 40 U of ribonuclease inhibitor (Takara Shuzo), and incubating at 42° C. for 1 hour. After the reaction solution was extracted with phenol, a conjugate of the cap(+)mRNA/cDNA heteroduplex and the vector primer was recovered by ethanol precipitation and then dissolved in 20 μl of water.

(3) Decapping Reaction

The cap structure of the mRNA was removed by mixing 20 μl of the cap(+)mRNA/cDNA heteroduplex solution with a reaction solution (50 mM sodium acetate, pH 5.5, 5 mM EDTA, 10 mM 2-mercaptoethanol), adding 10 U of tobacco acid pyrophosphatase (Nippon Gene), and incubating at 37° C. for 30 minutes. After the reaction solution was extracted with phenol, a conjugate of the decapped mRNA/cDNA heteroduplex and the vector primer was recovered by ethanol precipitation and then dissolved in 20 μl of water.

(4) Self-ligation

The end of the mRNA/cDNA heteroduplex and the EcoRV end of the vector primer were ligated and circularized (self-ligation reaction) by mixing 20 μl of either the cap(+)mRNA/cDNA heteroduplex solution obtained in the foregoing (2) or the decapped mRNA/cDNA heteroduplex solution obtained in the foregoing (3) with a reaction solution (50 mM Tris-HCl, pH 7.5, 5 mM MgCl₂, 10 mM 2-mercaptoethanol, 0.5 mM ATP, 2 mM DTT), adding 120 U of T4 RNA ligase (Takara Shuzo), and incubating at 20° C. for 16 hours. After the reaction solution was extracted with phenol, the self-ligation product was recovered by ethanol precipitation and then dissolved in 20 μl of water.

(5) Replacement of RNA Strand with DNA Strand

A vector (cDNA vector) carrying an insert of a cDNA/cDNA duplex was obtained by synthesizing the second-strand cDNA, that is, by replacing the RNA strand with a DNA strand. The replacement reaction was carried out by mixing 20 μl of the self-ligation product with a reaction solution (20 mM Tris-HCl, pH 7.5, 4 mM MgCl₂, 10 mM (NH₄)₂SO₄, 100 mM KCl, 50 μg/ml BSA, 0.1 mM dNTP), adding 0.3 U of RNaseH (Takara Shuzo), 4 U of E. coli DNA polymerase I (Takara Shuzo), and 60 U of E. coli DNA ligase (Takara Shuzo), and incubating at 12° C. for 5 hours. After the reaction solution was extracted with phenol, the cDNA vector was recovered by ethanol precipitation and then dissolved in 40 μl of TE.

(6) Transformation of E. coli

Transformation was carried out by electroporation after mixing 1 μl of the cDNA vector solution with DH12S competent cells (Invitrogen). The electroporation was carried out using a MicroPulser (BioRad). The obtained transformants were suspended in SOC medium, seeded on agar plates containing 100 μg/ml ampicillin, and incubated at 37° C. overnight. As a result, a library composed of about 10⁵-10⁶ E. coli transformants was obtained.

(7) Analysis of 5′-end Nucleotide Sequence of cDNA Clones

Colonies grown on the agar plate were picked up, suspended in LB medium containing 100 μg/ml ampicillin, and incubated at 37° C. overnight. After cells were harvested from the culture medium by centrifugation, plasmid DNA was isolated and purified from the cells by the alkaline/SDS method. This plasmid was used as a template for a cycle sequencing reaction using a kit (BigDye Terminator v3.0, ABI), and the 5′-end nucleotide sequence of the cDNA was determined with a fluorescent DNA sequencer (ABI).

When model mRNA prepared using m⁷G(5′)pppG(5′) as a cap analogue was used as a template, 15 clones out of 20 clones carrying a cDNA insert contained cap-consecutive cDNA. Among these clones, an extra “dG” (12 clones) or an extra “dTdG” (1 clone) not existing in the model mRNA was added before the dG of the transcription start site, and 2 clones had no extra “dG”. The remaining 5 clones were cap-nonconsecutive cDNA starting from the middle of the mRNA. The decapping reaction did not influence the number of transformants obtained, the ratio of cDNA starting with the cap site, or the addition of the extra “dG”.

On the other hand, when model mRNA prepared using A(5′)pppG(5′) as a cap analogue was used as a template, 18 clones out of 24 clones carrying a cDNA insert contained cap-consecutive cDNA. Among these clones, an extra “dA” (15 clones) or an extra “dTdA” (1 clone) not existing in the model mRNA was added before the dG of the transcription start site, and 2 clones had no extra “dG”. The remaining 6 clones were cap-nonconsecutive cDNA starting from the middle of the mRNA.

In addition, in both cases, the addition of an extra “dG” or “dA” not existing in the model mRNA was not observed in clones possessing cap-nonconsecutive cDNA.

Furthermore, when model mRNA without a cap structure was used as a template, 16 clones out of 19 clones carrying a cDNA insert contained the sequence starting with the transcription start site. Of these, 14 clones did not possess an extra sequence before the dG of the transcription start site, whereas in 2 clones an extra “dT” not existing in the model mRNA was added before the dG of the transcription start site. The remaining 3 clones were cap-nonconsecutive cDNA starting from the middle of the mRNA.

These results suggest that, in the method of this invention, a nucleotide “dC” complementary to the base “G” of the cap structure of the template mRNA is added to the 3′ end of the first-strand cDNA, and that this added 3′-end “dC” results in the addition of the complementary “dG” to the second-strand cDNA. Furthermore, “dT” was sometimes added in addition to the complementary “dG”. Therefore, the addition of “dG” or “dTdG” to the 5′ end of the cDNA indicates that the cDNA is cap-consecutive cDNA.
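The base-pairing logic suggested by these results can be sketched as follows. This is only a string-level illustration that assumes the reverse transcriptase adds a single nucleotide complementary to the cap base; the function names are hypothetical, and the mRNA is written in DNA letters for simplicity.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def second_strand(mrna_as_dna: str, cap_base: str = "") -> str:
    """Sense-strand cDNA expected when the cap base is copied.

    Assumption: the first strand is the reverse complement of the mRNA,
    extended at its 3' end by the complement of the cap base (if any).
    """
    first_strand = revcomp(mrna_as_dna) + revcomp(cap_base)
    return revcomp(first_strand)


model = "CTTTTTCGCAA"  # 5' end of the EF-1alpha model mRNA
print(second_strand(model, "G"))  # m7G cap
print(second_strand(model, "A"))  # A cap
print(second_strand(model))       # no cap
```

In this sketch a “G” cap yields a sense strand beginning with an extra G and an “A” cap yields an extra A, in line with the clone counts reported for the two cap analogues.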

(8) cDNA Synthesis Using Vector Primer with Protruding End

When the self-ligation reaction was performed after synthesizing the first-strand cDNA and generating a 5′-protruding end by cutting the vector primer with EcoRI, a cap-consecutive cDNA clone possessing “dG” at the 5′ end was obtained, as was the case with the vector primer having the EcoRV blunt end. Therefore, the restriction enzyme-cut end of the pKA1 vector primer can be not only a blunt end but also a 5′-protruding end. Furthermore, the use of the 5′-protruding end improved the efficiency of ligation and increased the number of clones constituting the cDNA library as compared with the blunt end.

Example 2 Preparation of cDNA Library Using mRNA Derived from Cultured Cell HT-1080

Total RNA was prepared from human fibrosarcoma cell line HT-1080 (purchased from Dainippon Pharmaceutical Co. Ltd.) using the AGPC method (a kit from Nippon Gene). Poly(A)⁺RNA was purified by binding to a biotinylated oligo(dT) primer (Promega), adding Streptavidin MagneSphere Particles, and collecting with a magnet. A cDNA library was prepared by synthesizing cDNA under the same conditions as described in Example 1 using 0.3 μg of poly(A)⁺RNA and 0.3 μg of a pKA1 vector primer, and carrying out transformation of E. coli. As a result, a library containing about 10⁵-10⁶ transformants was obtained. The libraries were prepared with or without the decapping reaction, but there was no significant difference between the analysis results of the two libraries. Thus, the results obtained without the decapping reaction are described hereafter.

The 5′-end nucleotide sequence of the cDNA was determined using plasmids isolated from colonies that were randomly selected from the foregoing library. For the 191 clones carrying a cDNA insert whose sequence was determined, a BLAST search was performed using the GenBank nucleotide sequence database, showing that 189 of them had been registered in the database as genes derived from mRNA. Of these, 178 clones, accounting for 94% of total clones, all contained a coding region. The most abundant clones were those encoding ribosomal protein P1 and elongation factor 1-α, and 5 clones of each were obtained. The 5′-end nucleotide sequences of these 5 clones were all 5′-GCCCTTTCCTCAGCTGCCGC . . . for ribosomal protein P1 and all 5′-GCTTTTTCGCAACGGGTTTG . . . for elongation factor 1-α. These sequences, except for the “dG” at the 5′ end, were identical with those of clones (Non-patent Document 6) obtained from the library prepared by the conventional method (the DNA-RNA chimera oligo capping method). Comparison of the sequences with the genome sequence showed that none of the “dG” at the 5′ end existed in the genome sequence, which confirmed that the “dG” was added during cDNA synthesis. This was also supported by the finding that 168 clones out of the 178 clones containing a coding region started with “dG”. Furthermore, 6 clones started with (dT)ndG (n=1-5). These clones may have been produced by further addition of multiple “dT” to the added “dG”.

Two clones had not yet been registered in the database as a gene derived from mRNA, but the sequences of these clones completely agreed with a part of the genome sequence and with some sequences in the EST database. Both of the sequences had the added “dG” not existing in the genome sequence. Therefore, these 2 clones are likely to be novel full-length cDNAs that have not been identified as genes.

Eleven clones started from the middle of the mRNA (cap-nonconsecutive cDNA). Three of these cDNA clones started with “dG” at the 5′ end, but these “dG” were derived from the corresponding mRNA, and clones possessing a newly added “dG” were not observed.

From the above results, 180 clones out of the 191 clones carrying a cDNA insert appear to be full-length (cap-consecutive cDNA), so that the overall full-length rate is calculated to be 94%. Since 3 clones out of the 171 clones starting with “dG” at the 5′ end were cap-nonconsecutive cDNA, in the case of this cDNA library a cDNA clone starting with “dG” at the 5′ end and possessing a coding region is full-length cDNA with a probability of 98%. In particular, clones starting with a “dG” that does not exist in the genome sequence are almost certainly full-length cDNA.
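The two rates quoted above follow directly from the clone counts of this Example. The short calculation below reproduces them; the variable and function names are illustrative only.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole


# Counts reported in Example 2.
clones_with_insert = 191
cap_consecutive = 180            # 191 minus 11 cap-nonconsecutive clones
start_with_dg = 171              # 168 cap-consecutive + 3 cap-nonconsecutive
start_with_dg_full_length = 168

full_length_rate = percent(cap_consecutive, clones_with_insert)
dg_reliability = percent(start_with_dg_full_length, start_with_dg)

print(f"{full_length_rate:.1f}%")  # 94.2%
print(f"{dg_reliability:.1f}%")    # 98.2%
```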

Example 3 Preparation of cDNA Library Using Total RNA Derived from Cultured Cell HT-1080

A cDNA library was prepared by synthesizing cDNA using 5 μg of total RNA prepared from human fibrosarcoma cell line HT-1080 and 0.3 μg of a vector primer under the same conditions as described in Example 1 (except that the decapping reaction was omitted), and transforming E. coli cells. As a result, a library containing about 10⁵ transformants was obtained.

The 5′-end partial sequences of cDNA clones in this library were analyzed as described in Example 2. For the 222 clones whose sequences could be determined, a BLAST search using the GenBank nucleotide sequence database showed that 217 clones had been registered in the database as genes derived from mRNA. Of these, 209 clones, accounting for 94% of total clones, all contained a coding region, and 189 of them started with “dG”. It should be noted that this library was prepared from total RNA and not from purified poly(A)⁺RNA, and that only a small amount (5 μg) of total RNA was used. Therefore, with this method, the purification of poly(A)⁺RNA can be omitted and a full-length cDNA library of high quality can be prepared from several μg of total RNA.

Example 4 Large-scale Sequencing Analysis of Full-length cDNA Library of Cultured Cell ARPE-19

A cDNA library was prepared by synthesizing cDNA using 2.5 μg of poly(A)⁺RNA prepared from human retinal pigment epithelium cell line ARPE-19 (obtained from ATCC) and 0.7 μg of a vector primer under the same conditions as described in Example 1, and transforming E. coli cells. The 5′-end partial sequences of cDNA clones in this library were analyzed as described in Example 2. For the 3683 clones whose sequences could be determined, a BLAST search using the GenBank nucleotide sequence database showed that 3662 clones had been registered in the database as genes derived from mRNA. Of these, 3474 clones, accounting for 94% of total clones, were full-length cDNA clones, and 3069 of them started with “dG” or “(dT)ndG”.

Example 5 Preparation of pGCAP1 Vector Primer

pGCAP1 was prepared using a multifunctional cloning vector pKA1 (Non-patent Document 6) as a starting material. FIG. 3A shows its structure and SEQ ID NO: 1 in the sequence list shows its entire nucleotide sequence. The differences from pKA1 are (1) the replication origin was changed to a pUC19-derived one, (2) a PacI site was added upstream of the HindIII restriction enzyme site of pKA1, and (3) the EcoRI-BstXI-EcoRV-KpnI site was replaced with an EcoRI-AflII-SwaI-KpnI site. The first nucleotide “A” in the sequence of SEQ ID NO: 1 corresponds to the HindIII site and the 568th nucleotide corresponds to the EcoRI site.

After 100 μg of pGCAP1 was completely digested with 200 U of KpnI, the fragment was isolated by 0.8% agarose gel electrophoresis. By adding 375 U of terminal transferase (Takara Shuzo) to 70 μg of the obtained fragment in the presence of 20 μM dTTP and incubating at 37° C. for 30 minutes, a dT tail of about 60 nucleotides was added to the 3′-protruding end generated by KpnI digestion. Then the reaction product was digested with SwaI and the longer fragment was isolated by 0.8% agarose gel electrophoresis. This was used as the pGCAP1 vector primer (FIG. 3B).

Example 6 Preparation of cDNA Library Using pGCAP1 Vector Primer

A cDNA library was prepared by synthesizing cDNA under the same conditions as described in Example 3 using 5 μg of total RNA of human fibrosarcoma cell line HT-1080 and 0.3 μg of the pGCAP1 vector primer prepared in Example 5, and transforming E. coli cells. As a result, a library containing about 2×10⁵ transformants was obtained. Analysis of the 5′-end partial sequences of the cDNA clones in this library showed that full-length cDNAs possessing “dG” at their 5′ ends were obtained and that the full-length rate was 95%, as in Example 3.

When a pKA1 vector primer is used, addition of one G to the EcoRV-cut end ( . . . GAT) generates . . . GATG . . . , which contains a new initiation codon ATG. This is not a problem when the purpose is to determine the sequence of the cap-consecutive cDNA, but the presence of the extra ATG may interfere with correct transcription/translation of the cDNA when this vector is used as an expression vector. With a pGCAP1 vector primer, this problem does not occur because the addition of one G to the SwaI-cut end ( . . . ATTT) generates . . . ATTTG . . . , which does not contain ATG.
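The difference between the two vector ends can be checked with a few lines of code. This is a minimal sketch that assumes the blunt ends left on the vector are GAT for EcoRV (GAT^ATC) and ATTT for SwaI (ATTT^AAAT) and that exactly one G is added; the function name is hypothetical.

```python
def creates_atg(vector_end: str, added: str = "G") -> bool:
    """True if joining `added` to the vector end forms a new ATG at the junction."""
    return (vector_end + added).endswith("ATG")


# Assumed blunt ends remaining on the vector after digestion.
ECORV_END = "GAT"   # EcoRV cuts GAT^ATC
SWAI_END = "ATTT"   # SwaI cuts ATTT^AAAT

print(creates_atg(ECORV_END))  # True: ...GATG contains ATG
print(creates_atg(SWAI_END))   # False: ...ATTTG does not
```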

Furthermore, when self-ligation was carried out after synthesizing the first-strand cDNA and generating a 5′-protruding end by cutting the vector primer with AflII, cap-consecutive cDNA possessing “dG” at its 5′ end could be obtained, as in the case of the vector primer possessing the SwaI blunt end. When one G was added to the AflII-cut end ( . . . CTTAA), . . . CTTAAG . . . was generated, restoring the AflII recognition site. As a result, cap-consecutive cDNA can be cut with AflII. Therefore, AflII digestion can be used to identify a cap-consecutive cDNA clone.

Example 7 Expression Profile Analysis of Cultured Cell ARPE-19 Full-length cDNA Library

A cDNA library was prepared from 5 μg of total RNA of human retinal pigment epithelium cell line ARPE-19 as described in Example 4, and then the 5′-end partial nucleotide sequences of the cDNA clones were analyzed. For the 3204 clones whose sequences could be determined, a BLAST search using the GenBank nucleotide sequence database showed that 3038 clones, accounting for 95% of total clones, were full-length cDNA clones.

Examination of the insert size distribution of these full-length cDNA clones showed that the clones contained inserts of a wide range of sizes, from 0.1 kbp for the shortest to 10 kbp for the longest, with an average length of 1.94 kbp. Long clones carrying an insert of more than 3 kbp accounted for 16% of total clones.

These clones were classified into 1408 kinds of genes. The most abundant clone was glyceraldehyde-3-phosphate dehydrogenase cDNA, of which 44 clones (1.4% of total clones) were obtained. Only 235 kinds of cDNAs showed an expression level of more than 0.1%, that is, more than 2 clones of each were obtained. On the other hand, 971 kinds of cDNAs (69% of total genes) were genes whose expression level was less than 0.03%, because only one clone of each was obtained. Furthermore, some clones appear to be novel gene clones of very low expression level, whose sequences agree with the genome sequence but have not yet been registered in the database. As shown above, the obtained cDNA library was confirmed to be a low-redundancy, high-quality library containing a large number of genes of low expression level.
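The kind of tally behind these figures can be sketched as follows. The gene labels and clone list are hypothetical toy data, and the function is an illustration rather than the analysis actually used; it computes each gene's share of the clones and the fraction of genes represented by a single clone.

```python
from collections import Counter


def expression_profile(gene_per_clone):
    """Return (percent of clones per gene, fraction of genes seen only once)."""
    counts = Counter(gene_per_clone)
    total = len(gene_per_clone)
    abundance = {gene: 100.0 * n / total for gene, n in counts.items()}
    singletons = sum(1 for n in counts.values() if n == 1)
    return abundance, singletons / len(counts)


# Hypothetical toy assignment of eight sequenced clones to genes.
clones = ["GAPDH"] * 3 + ["EEF1A1"] * 2 + ["geneX", "geneY", "geneZ"]
abundance, singleton_fraction = expression_profile(clones)
print(abundance["GAPDH"], singleton_fraction)
```

Applied to the counts stated above, the same arithmetic gives 44/3038, about 1.4%, for the most abundant clone and 971/1408, about 69%, for genes represented by a single clone.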

The results of the above analysis suggest that the cDNA library prepared by the method of this invention contains full-length cDNA clones at high rates and faithfully reflects the expression level of mRNA expressed in cells, because there is no bias with respect to gene length or expression level. Therefore, this method is effective not only for obtaining full-length cDNA clones but also for analyzing the expression profile of genes expressed in the cells.

Example 8 Preparation of pGCAP10 Vector Primer and Preparation of cDNA Library Using This

pGCAP10 was prepared using pGCAP1 as a starting material. FIG. 4A shows its structure, and SEQ ID NO: 2 in the sequence list shows its entire nucleotide sequence. The difference from pGCAP1 is that the EcoRI-AflII-SwaI-KpnI site was replaced with a SwaI-EcoRI-FseI-EcoRV-KpnI site. As described in Example 5, a dT tail of about 60 nucleotides was added to the 3′-protruding end generated by KpnI in pGCAP10. After digesting the reaction product with EcoRV, the longer fragment was used as the pGCAP10 vector primer (FIG. 4B). Using 0.3 μg of this vector primer and 5 μg of total RNA of human fibrosarcoma cell line HT-1080, the first-strand cDNA was synthesized under the same conditions as described in Example 3, and then the end on the vector side was converted to a 5′-protruding end by EcoRI digestion. After self-ligation, the ligation product was used for transformation of E. coli cells, omitting the process of replacing the mRNA strand with a DNA strand. As a result, a cDNA library containing about 10⁶ transformants was obtained. Analysis of the 5′-end partial sequences of the cDNA clones in this library showed that full-length cDNA with “dG” added at the 5′ end was obtained and that the full-length rate was 95%, as in Example 3.

INDUSTRIAL APPLICABILITY

As described in detail above, the invention of this application makes it possible to synthesize full-length cDNAs that are guaranteed to possess a consecutive sequence starting with the nucleotide of the transcription start site from one to several μg of total RNA, at a high yield of more than 90%, in a small number of steps and without using PCR. As a result, information on the primary structure of the protein encoded by the gene and information on the expression regulatory region controlling the expression of the gene can be reliably obtained, so that this invention greatly contributes not only to the effective use of genome information but also to the production of recombinant proteins useful in medical fields and the like.

1. A method for constructing a DNA vector having a cDNA synthesized from an mRNA, which method comprises the steps of: (i) annealing a double-stranded DNA primer and an mRNA mixture, wherein the double-stranded DNA primer consists of a first strand having a primer sequence and a second strand, and wherein the double-stranded DNA primer contains a replication origin or both a replication origin and a promoter for cDNA expression, (ii) preparing an mRNA/cDNA heteroduplex by synthesizing a first-strand cDNA primed with the primer sequence of the first strand of the double-stranded DNA primer using reverse transcriptase, (iii) circularizing the mRNA/cDNA heteroduplex by ligating the 3′ end of the first strand cDNA to the 5′ end of the first strand of the double-stranded DNA primer using T4 RNA ligase to form a circular mRNA/cDNA heteroduplex, and (iv) replacing the RNA in the mRNA/cDNA heteroduplex with a second-strand cDNA by synthesizing the second-strand cDNA with a DNA polymerase, thereby constructing the DNA vector having cDNA consisting of the first-strand cDNA and the second-strand cDNA.
 2. The method of claim 1, wherein the mRNA is contained in a cell extract.
 3. The method of claim 1, wherein the mRNA is synthesized by in vitro transcription.
 4. The method of claim 1, wherein the primer sequence of the double-stranded DNA primer contains a sequence complementary to a partial sequence of the mRNA.
 5. The method of claim 1, wherein the primer sequence of the double-stranded DNA primer contains an oligo dT complementary to a poly(A) sequence of the mRNA.
 6. The method of claim 1, which comprises the following step between the step (ii) and the step (iii): (ii′) generating a 5′-protruding end or a blunt end at the terminal of the double-stranded DNA primer by cutting the mRNA/cDNA heteroduplex using a restriction enzyme.